FOREWORD


This interim technical report was prepared on Contract AF 33(615)-1737
between  Iowa  State  University  of  Science  and  Technology  and  Aerospace  Research 
Laboratories,  Office  of  Aerospace  Research,  United  States  Air  Force.  It 
summarizes  the  research  accomplished  under  the  direction  of  Professors  Oscar 
Kempthorne  and  George  Zyskind,  principal  investigators,  during  the  eighteen- 
month  period  July  1964  through  December  1965.  The  contract  has  been  extended 
through  December  1967  at  which  time  a  final  report  is  scheduled. 

The  work  performed  under  contract  was  initiated  and  coordinated  by  Mary 
D.  Lum,  Research  Mathematical  Statistician,  Applied  Mathematics  Research 
Laboratory,  Aerospace  Research  Laboratories,  and  supported  by  funds  for 
Project  7071,  Research  in  Applied  Mathematics,  Work  Unit  7071-00-10,  Analysis 
of  Variance  and  Probability. 
ABSTRACT

Research  on  Analysis  of  Variance  and  Data  interpretation  is  described. 
Section  I  discusses  estimation  problems  in  variance  component  and  mixed 
model problems. Section II considers the combination of information on
estimable  functions  from  distinct  uncorrelated  sources  and  justifies  some 
of  the  common  applications  in  experimental  design  problems.  Section  III 
discusses  size  and  power  under  experiment  randomization  of  several 
competitive  tests  for  the  paired  design  and  presents  conclusions  about 
the  high  relative  merits  of  the  variance  ratio  randomization  and  the 
Wilcoxon  tests.  Section  IV  discusses  the  development  of  high  speed 
computational methods for the calculation of fourth degree generalized
polykays  of  variances  and  covariances  of  estimated  variance  components 
for  balanced  samples  from  balanced  populations.  Section  V  summarizes 
briefly  papers  on  the  design  of  experiments  and  multivariate  responses  in 
experiments  and  the  1965  Fisher  Memorial  lecture  on  experimental 
inference. 
INTRODUCTION

The  research  described  in  this  report  deals  with  aspects  of  linear 
model  methodology  and  with  a  search  for  greater  understanding  of 
consequences  of  sampling  and  randomization  in  experiments.  The 
present  report  summarizes  briefly  work  performed  on  the  contract  and 
dealt  with  in  detail  in  separate  reports  and  papers  now  in  final  stages  of 
preparation.  The  separate  but  related  accounts  deal  with  the  following 
general  topics: 

I.  Unbiased  Estimation  in  Variance  Component  Models 

II.  Simple  Linear  Combinability  of  Information  from  Independent 
Sources 

III.  Size  and  Power  of  Certain  Tests  under  Experiment 
Randomization 

IV.  Computation  of  Variances  of  Estimated  Variance  Components 
in  Finite  Balanced  Population  Structures 

V.  General  Related  and  Broader  Matters 

The  ensuing  sections  delineate  briefly  the  main  results  and  general 
viewpoints  arrived  at  in  investigating  the  above  problems. 
I. UNBIASED ESTIMATION IN VARIANCE COMPONENT MODELS


With regard to variance component models we have considered the
problem of minimum variance (M.V.) unbiased estimation of regression
parameters and variance components in the mixed model

$$y = \sum_{i=0}^{r} X_i \gamma_i + \sum_{i=r+1}^{k+1} X_i \beta_i$$

where the $\gamma_i$'s are fixed effects, the $\beta_i$'s are random effects with
distributional properties to be further specified, and the $X_i$'s are known
fixed matrices whose elements are not necessarily restricted to be 0's or
1's. We assume throughout that $X_{k+1} = I$, $E(\beta_i \beta_j') = 0$ $(i \ne j)$, and
$E(\beta_{k+1} \beta_{k+1}') = I\sigma_{k+1}^2$. A model representation is defined to be balanced
if $X_i X_i' X_j X_j' = X_j X_j' X_i X_i'$ $(i \ne j;\ i, j = 0, \ldots, k+1)$. A representation that
is not balanced is unbalanced.
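The commutativity condition that defines balance can be checked mechanically for a given set of design matrices. The sketch below does this for a one-way layout; the helper names (`is_balanced`, `one_way_layout`) and the two example layouts are illustrative assumptions, not taken from the report.

```python
from itertools import combinations


def matmul(a, b):
    """Plain-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gram(x):
    """Return X X' for a design matrix X."""
    xt = [list(col) for col in zip(*x)]
    return matmul(x, xt)


def is_balanced(design_matrices):
    """True if X_i X_i' and X_j X_j' commute for every pair i != j."""
    grams = [gram(x) for x in design_matrices]
    return all(matmul(g, h) == matmul(h, g) for g, h in combinations(grams, 2))


def one_way_layout(group_sizes):
    """Hypothetical example: X_0 = column of ones, X_1 = group indicators, X_2 = I."""
    n = sum(group_sizes)
    x0 = [[1] for _ in range(n)]
    x1 = [[1 if c == g else 0 for c in range(len(group_sizes))]
          for g, size in enumerate(group_sizes) for _ in range(size)]
    x2 = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return [x0, x1, x2]


equal_groups = is_balanced(one_way_layout([2, 2]))    # two groups of two
unequal_groups = is_balanced(one_way_layout([1, 3]))  # groups of one and three
```

Under these assumptions the equal-group layout satisfies the condition and the unequal-group layout does not, which matches the usual informal meaning of balance.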

Completeness  of  the  sufficient  set  of  statistics  is  established  by  a 
restriction  on  the  number  of  roots  of  V  =  E(yy' )  -  E(y)E(y').  Several 
theorems,  on  the  minimum  variance  properties  under  normality  of 
Model I type A.o.V. estimators for variance components, and simple
least squares estimators of estimable functions of regression parameters
for balanced mixed models, are proved. Certain optimality properties
for  the  same  estimators,  when  the  normality  assumption  is  replaced  by  a 
less  stringent  condition,  are  obtained. 

Two results for the model

$$y = j_n \mu + \sum_{i=1}^{k+1} X_i \beta_i = \sum_{i=0}^{k+1} X_i \beta_i$$

where $E(\beta_i \beta_i') = I\sigma_i^2$, which are due to Graybill and Hultquist (1961) and
which we have refined, are:

(1) If (a) all $\sigma_i^2$ are estimable, (b) $X_i X_i' X_j X_j' = X_j X_j' X_i X_i'$
$(i, j = 0, \ldots, k+1)$ and (c) the random $\beta_i$ vectors are normally
distributed, then there is a complete sufficient statistic for the parameters
$(\mu, \sigma_1^2, \ldots, \sigma_{k+1}^2)$ if, and only if,

$$W = \sum_{i=1}^{k+1} X_i X_i' \sigma_i^2 + X_0 X_0' \mu^2$$

has $k+2$ distinct latent roots. The set of complete sufficient statistics consists of
$\bar{y}$ and $y' P_i' P_i y$ $(i = 1, \ldots, k+1)$ where the $P_i$'s are collections of vectors
of $P$, an orthogonal matrix such that $P W P' = \Lambda$ (diagonal), and where
all vectors of $P_i$ (say) correspond to the same latent root of $W$.

We define the class of situations for which commutativity of the
$X_i X_i'$ and $X_j X_j'$ $(i, j = 0, \ldots, k+1)$ holds and $W$ has $k+2$ distinct roots to be
the class P.

(2) If $\beta_i$ and $\beta_j$ are independent for all $i$ and $j$ $(i \ne j)$, finite
fourth moments exist for all random variables, and within every given
vector $\beta_i$ all fourth moments are equal and all third moments are
equal, then the same estimators, i.e., the usual Model I A.o.V. mean
square estimators for the $\theta_i = E(y' P_i' P_i y)$, that are M.V. unbiased under
normality, are best quadratic unbiased (b.q.u.) estimators under the present
assumptions.
We have obtained results analogous to 1, and under slightly more
extended restrictions results analogous to 2 above, for the completely
random model under the assumption that $E(\beta_i \beta_i') = (a_i \backslash b_i)$ $(i = 1, \ldots, k)$
where $(a_i \backslash b_i)$ is a matrix with $a_i$ on the diagonal and $b_i$ off it. The
same estimators as before are complete sufficient for
$(\mu, a_1 - b_1, \ldots, a_k - b_k, \sigma_{k+1}^2)$.

For the mixed model, under the assumptions (a) normality of the $\beta_i$
vectors, (b) $E(\beta_i \beta_i') = I\sigma_i^2$ $(i = r+1, \ldots, k+1)$,
(c) $X_i X_i' X_j X_j' = X_j X_j' X_i X_i'$ $(i, j = 0, \ldots, k+1)$, (d) the matrix

$$W = X_0 X_0' \mu^2 + \sum_{i=1}^{k+1} X_i X_i' \sigma_i^2 = J\mu^2 + V,$$

where $V$ is the variance matrix
of $y$ in the corresponding completely random case, has $k+2$ distinct roots
and  (e)  P^.  (i  ^0)  /  (j/k+1)  =  0,  where  the  P^s  are  as  defined 

previously, we have shown that the sufficient statistic $(X\hat{\gamma}, s_{r+1}^2, \ldots, s_{k+1}^2)$
for the parameters $(X\gamma, \sigma_{r+1}^2, \ldots, \sigma_{k+1}^2)$ is complete. We have also given
the counterpart of 2 above for the mixed model, namely best linear
unbiased (b.l.u.) estimators for estimable functions of regression parameters
and b.q.u. estimators for variance components. We have also presented
analogous results under slightly more extended restrictions for a mixed
model with $E(\beta_i \beta_i') = (a_i \backslash b_i)$ $(i = r+1, \ldots, k)$.

The class of model situations with $E(\beta_i \beta_i') = I\sigma_i^2$ and for which, for at
least some $i, j$ $(i \ne j)$, $X_i X_i' X_j X_j' \ne X_j X_j' X_i X_i'$, or the number of roots of
the corresponding matrix $W$ exceeds $k+2$, we designate as the class S-P. In the class
S-P, a class containing many design situations, some common and
others less so, the condition of balance is often not satisfied and in all
of the examples that we have thus far examined, even if normality of the $\beta_i$'s
is assumed, the minimal sufficient set of statistics is not complete. It
is not known whether U.M.V. estimators exist in these cases, and if they
do, how to proceed to obtain them. Since here the assumption of
normality cannot apparently be profitably used, and later removed, we
favor obtaining alternative estimators directly, and comparing them at
different points of the parameter space by means of the variances of each
variance component estimator.

We have given consideration to the simple "least squares" method of
estimation in unbalanced cases. We present a transformation procedure,
which is actually a single degree of freedom breakdown of sums of squares,
and which in random models provides one means of finding variances of
variance component estimators. The procedure suggests, theoretically
at least, an alternative way of weighting single degree of freedom sums of
squares to find estimators with smaller variance than those given by
simple least squares.

We have attacked the problem of the variance of a quadratic form, and
the covariance between two forms that arise in mixed and random models.
We have found considerable simplifications in the case of a usual least
squares method of estimation, also known as Method 3 of Henderson
(1953), and we have found a further simplification under the assumption
of normality of random effects. We have applied the general results
derived to obtain variance formulae for various sums of squares which
have been suggested for finding estimators of variance components in
random models with added concomitants.
II. SIMPLE LINEAR COMBINABILITY OF INFORMATION
FROM INDEPENDENT SOURCES

The  issue  of  combining  information  from  independent  sources  has  long 
been  of  general  interest  to  experimenters  and  statisticians  alike.  A 
common  procedure  has  been  to  combine  estimates  of  scalar  parameters  by 
weighting inversely as the variances. This procedure is not generally
best  for  vector  parameters.  We  have  therefore  examined  combinability 
with  data  arising  from  linear  models  of  the  type 

$$y = X\beta + e$$

where $X$ is an $n \times p$ matrix, and $\beta$ is a $p \times 1$ vector of unknown
parameters. The vector of errors $e$ has non-singular covariance matrix
$V$.

If one has several independent sets of data $y_i = X_i\beta + e_i$ with the same
parameter vector $\beta$ and respective non-singular variance matrices $V_i$,
there is immediate interest in the simplest possible method of combining
information from the several sources to get the best linear unbiased
estimator (b.l.u.e.) of a parametric function $\lambda'\beta$ estimable from the
full  set  of  data.  A  commonly  used  assumption  will  be  that  and 

are  essentially  known.  The  report  is  concerned  with  the  specific 
conditions under which the b.l.u.e. of a $\lambda'\beta$, estimable in each
independent source, can be obtained by simple weighting of the information
available in each independent source. Special attention is given to the
situation of exactly two sources of information as, for example, in the
case of inter and intra block information in incomplete block designs.
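For the scalar case mentioned at the start of this section, weighting inversely as the variances can be sketched as follows. The function name and the numerical values are illustrative assumptions; exact rational arithmetic is used only to keep the example free of rounding.

```python
from fractions import Fraction


def combine_inverse_variance(estimates, variances):
    """Combine independent unbiased scalar estimates, weighting by 1/variance.

    Returns the combined estimate and its variance.
    """
    weights = [Fraction(1) / Fraction(v) for v in variances]
    total = sum(weights)
    combined = sum(w * Fraction(e) for w, e in zip(weights, estimates)) / total
    return combined, 1 / total


# Hypothetical inputs: two estimates of the same scalar with variances 1 and 3.
combined, variance = combine_inverse_variance([10, 14], [1, 3])
```

Here the first source receives weight 3/4 and the second 1/4. The question treated in this section is when the same kind of single weight w suffices for a function of a vector parameter.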

The particular question examined may be stated as follows: if $\lambda'\beta$
is estimable from the data $y_1 = X_1\beta + e_1$ and also estimable from the
independent data $y_2 = X_2\beta + e_2$, when is the b.l.u.e. $\lambda'\beta^*$ given by

$$\lambda'\beta^* = w\,\lambda'\hat{\beta}_1 + (1-w)\,\lambda'\hat{\beta}_2,$$

where $\lambda'\hat{\beta}_1$ and $\lambda'\hat{\beta}_2$ are the b.l.u.e.'s from the first and the second
sources respectively?

Definition: An estimable parametric function $\lambda'\beta$ is said
to be best combinable by simple weighting (b.c.s.w.) if

$$\lambda'\beta^* = w\,\lambda'\hat{\beta}_1 + (1-w)\,\lambda'\hat{\beta}_2, \quad 0 < w < 1,$$

or $\lambda'\beta^* = \lambda'\hat{\beta}_1$ or $\lambda'\beta^* = \lambda'\hat{\beta}_2$.


The  extension  of  this  definition  to  k  >  2  uncorrelated  sources  of 
information  is  obvious. 

The  following  main  theorems  have  been  proved. 


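Theorem 1: A necessary and sufficient condition for $\lambda'\beta$ to be
b.c.s.w. is that the set of solutions of the conjugate normal equation

$$(X_1' \;\; X_2') \begin{pmatrix} V_1^{-1} & 0 \\ 0 & V_2^{-1} \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \rho = \lambda$$

is identical with the set of solutions to either the pair of conjugate equations

$$X_1' V_1^{-1} X_1 \rho = w\lambda, \qquad X_2' V_2^{-1} X_2 \rho = (1-w)\lambda$$

or

$$X_1' V_1^{-1} X_1 \rho = \lambda, \qquad X_2' V_2^{-1} X_2 \rho = 0$$

or

$$X_1' V_1^{-1} X_1 \rho = 0, \qquad X_2' V_2^{-1} X_2 \rho = \lambda.$$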

Corollary 1.1: A necessary and sufficient condition for a $\lambda'\beta$,
estimable in both sources, to be b.c.s.w. is that $\lambda$ be in the image of a
subspace $S$ such that the mapping $X_1' V_1^{-1} X_1$ restricted to $S$ is a scalar
multiple of the mapping $X_2' V_2^{-1} X_2$ restricted to $S$, i.e.,

$$X_1' V_1^{-1} X_1 \big|_S = k\, X_2' V_2^{-1} X_2 \big|_S .$$

Corollary 1.2: A necessary and sufficient condition that $\lambda'\beta$ be
best estimated from source one alone, i.e., $\lambda'\beta^* = \lambda'\hat{\beta}_1$, is that $\lambda$ be
in the image under the mapping $X_1' V_1^{-1} X_1$ of a subspace $S$ such that $S$
is contained in the null space of $X_2' V_2^{-1} X_2$.

Denoting the row space of $X_i$ by $\chi_i$ we may state another corollary
of Theorem 1.

Corollary 1.3: A necessary and sufficient condition for $\lambda'\beta^* = \lambda'\hat{\beta}_1$
for every $\lambda$ in $\chi_1$ is that $\chi_1 \cap \chi_2 = 0_p$.

To simplify notation and facilitate the discussion we shall hereafter,
with no real loss of generality, restrict $V_i$ to be of the form $\sigma_i^2 I$. If
we restrict attention to vectors $\lambda'$ in $\chi_1 \cap \chi_2 \ne 0_p$ we may further
characterize the set of corresponding b.c.s.w. $\lambda'\beta$'s by observing that
$\lambda'\beta$ is b.c.s.w. if and only if $\lambda$ is the image under $X_1'X_1$ or $X_2'X_2$ of
a vector $\rho$ such that $\rho$ is a generalized eigenvector of $X_1'X_1 - kX_2'X_2$
for some generalized eigenvalue $k \ne 0$, i.e.,

$$(X_1'X_1 - kX_2'X_2)\,\rho = 0. \qquad (1)$$


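Lemma: If $A$ and $B$ are real $p \times p$ positive semi-definite
matrices, with rank$(A) = a \le$ rank$(B) = b$, then there exists a
real non-singular matrix $T$ and real diagonal matrices $A^*$ and $B^*$
such that $A^* = T'AT$ and $B^* = T'BT$ where

$$A^* = \begin{pmatrix} I_a & 0 \\ 0 & 0 \end{pmatrix}, \qquad
B^* = \begin{pmatrix} M_r & & & \\ & 0_{a-r} & & \\ & & I_{b-r} & \\ & & & 0_{p-a-b+r} \end{pmatrix}, \qquad (2)$$

with $M_r = \mathrm{diag}(\mu_1, \ldots, \mu_r)$, and the $\mu_i$, $i = 1, \ldots, r$, are positive.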


The application of the lemma to $A = X_1'X_1$ and $B = X_2'X_2$ is evident.
Equation (1) becomes

$$0 = T'(X_1'X_1 - kX_2'X_2)\,T\,T^{-1}\rho =
\begin{pmatrix} I_r - kM_r & & & \\ & I_{a-r} & & \\ & & -kI_{b-r} & \\ & & & 0 \end{pmatrix} \tau \qquad (3)$$

where $\rho = T\tau$. Because of the diagonal forms of $T'X_1'X_1T$ and
$T'X_2'X_2T$ in (2), bases for the row spaces of $T'X_1'X_1T$ and $T'X_2'X_2T$
are respectively the transposed columns of $E_1 = (\epsilon_1, \ldots, \epsilon_a)$ and
$E_2 = (\epsilon_1, \ldots, \epsilon_r, \epsilon_{a+1}, \ldots, \epsilon_{a+b-r})$, where $\epsilon_i$ is the column of zeros with
1 in the $i$-th position. Since $T$ is non-singular, any vector $\delta = T\tau$ for
some unique $\tau$. Thus for any vector $\delta$ the image

$$T'X_1'X_1\,\delta = T'X_1'X_1\,T\tau = \sum_{i=1}^{a} a_i \epsilon_i$$

for the proper coefficients $a_i$, and
thus for any vector $\delta$ the image $X_1'X_1\,\delta = \sum_{i=1}^{a} a_i (T')^{-1}\epsilon_i$. Hence the
linearly independent columns of $(T')^{-1}E_1 = (t_1, t_2, \ldots, t_a)$ form a basis
for $\chi_1$. Similarly the linearly independent columns of
$(T')^{-1}E_2 = (t_1, \ldots, t_r, t_{a+1}, \ldots, t_{a+b-r})$ form a basis for $\chi_2$. Since
the full set $\{t_1, \ldots, t_{a+b-r}\}$ is also linearly independent, the set
$\{t_1, \ldots, t_r\}$ is a basis for $\chi_1 \cap \chi_2$.
For a corresponding $k = \mu_i^{-1}$, $i = 1, \ldots, r$, $\tau = \epsilon_i$ is a solution to (3)
and $\rho_i = T\epsilon_i$ is a generalized eigenvector of $X_1'X_1 - \mu_i^{-1}X_2'X_2$. Thus,
the image of $\rho_i$ under either mapping $X_1'X_1$ or $X_2'X_2$ is a vector $\lambda_i$
such that $\lambda_i'\beta$ is b.c.s.w. But $X_1'X_1\rho_i = X_1'X_1T\epsilon_i = (T')^{-1}\epsilon_i = t_i$
for $i = 1, \ldots, r$. Thus the basis $\{t_1, t_2, \ldots, t_r\}$ (the first $r$ rows of
$T^{-1}$) of $\chi_1 \cap \chi_2$ constitutes a set of $r$ independent coefficient vectors
of b.c.s.w. linear parametric functions. We have therefore proved the
following theorem.

Theorem 2: If the rank of the intersection space of the row spaces
of $X_1$ and $X_2$ is $r$ then there exist $r$ linearly independent vectors
$\lambda'$ in $\chi_1 \cap \chi_2$ such that $\lambda'\beta$ is b.c.s.w.

Theorem 3: If a subset of $s$ generalized eigenvalues $k_i = \mu_i^{-1}$,
$i \le r$, of (1) are equal then there exists a corresponding $s$ dimensional
subspace of $\chi_1 \cap \chi_2$ in which every vector $\lambda'$ is the coefficient of a
b.c.s.w. parametric function.

Theorem 4: A sufficient condition that $\lambda'\beta$ be b.c.s.w. is that $\lambda$
be a common eigenvector of $X_1'X_1$ and $X_2'X_2$.

Theorem 5: If $\lambda_1, \ldots, \lambda_r$ is a set of common eigenvectors of
$X_1'X_1$ and $X_2'X_2$ then $(\sum_{i=1}^{r} a_i \lambda_i)'\beta$ is b.c.s.w. if and only if
$k = c_{11}c_{12}^{-1} = \cdots = c_{r1}c_{r2}^{-1}$, where $c_{i1}$ and $c_{i2}$ are the eigenvalues
of $X_1'X_1$ and $X_2'X_2$, respectively, associated with $\lambda_i$.

The  above  two  theorems  apply  to  the  case  of  k  >  2  uncorrelated 
sources  of  information. 
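Theorems 4 and 5 can be illustrated numerically in the simplest setting. The sketch below assumes two sources with unit error variances and diagonal matrices $X_1'X_1$ and $X_2'X_2$, so that the coordinate vectors are common eigenvectors; the function names and all numbers are hypothetical. Under these assumptions the pooled normal equations give the combined estimate coordinate by coordinate, and a single weight $w$ exists exactly when the eigenvalue ratios agree on the coordinates that $\lambda$ uses.

```python
from fractions import Fraction


def combined_estimate(lam, c1, c2, beta1, beta2):
    """lam' beta* from the pooled normal equations (diagonal case, unit variances).

    c1, c2 hold the diagonal entries (eigenvalues) of X1'X1 and X2'X2;
    beta1, beta2 are the per-source estimates.
    """
    return sum(l * Fraction(a * x + b * y, a + b)
               for l, a, b, x, y in zip(lam, c1, c2, beta1, beta2))


def simple_weight(lam, c1, c2):
    """Return the weight w if lam' beta is b.c.s.w., otherwise None."""
    weights = {Fraction(a, a + b) for l, a, b in zip(lam, c1, c2) if l != 0}
    return weights.pop() if len(weights) == 1 else None


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


lam = (1, 1)
beta1, beta2 = (3, 6), (9, 3)

w_equal = simple_weight(lam, (4, 2), (2, 1))    # ratios 4/2 and 2/1 agree
w_unequal = simple_weight(lam, (4, 1), (2, 3))  # ratios 4/2 and 1/3 differ
pooled = combined_estimate(lam, (4, 2), (2, 1), beta1, beta2)
weighted = w_equal * dot(lam, beta1) + (1 - w_equal) * dot(lam, beta2)
```

With equal ratios the pooled value and the simply weighted value coincide; with unequal ratios no single weight serves for $\lambda = (1, 1)'$, although each coordinate alone remains b.c.s.w.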

Making  use  of  the  sufficiency  of  eigenvectors  and  the  fact  that  the 
interblock  and  intrablock  information  matrices  in  any  incomplete  block 
design  have  a  common  orthogonal  diagonalization  we  deduced  the 
following  theorems. 

Theorem 6: For incomplete designs, a linear function of the
treatments, $\lambda'\tau$, is b.c.s.w. from the interblock and intrablock
sources of information if and only if $\lambda$ is an eigenvector of $NN'$ where
$N$ is the treatment by block incidence matrix.

Theorem 7: In an incomplete block design $(t, r, b, k, s_{ij})$ a
necessary and sufficient condition for the interblock and intrablock
estimates of the set of treatment effects denoted by $\{\tau_1, \tau_2, \ldots, \tau_a\}$ to be
b.c.s.w. is that all treatments occur the same number of times with
treatments $T_1, T_2, \ldots, T_a$.

Corollary 7.1: In an incomplete block design $(t, r, b, k, s_{ij})$, if
the treatment effects $\{\tau_1, \ldots, \tau_a\}$ are b.c.s.w. then so are any linear
combinations of the set.

Corollary 7.2: In an incomplete block design $(t, r, b, k, s_{ij})$, all
treatment effects are b.c.s.w. if and only if the design has a b.i.b.
structure.
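The eigenvector criterion of Theorem 6 is easy to check for small designs. The following sketch builds the treatment by block incidence matrix $N$ for two hypothetical designs, a balanced incomplete block design with three treatments in blocks of two and a cyclic design with four treatments, and tests whether a given contrast is an eigenvector of $NN'$. The function names and the designs are assumptions chosen for illustration.

```python
def incidence(blocks, t):
    """Treatment-by-block incidence matrix N for treatments labelled 1..t."""
    return [[1 if trt in blk else 0 for blk in blocks] for trt in range(1, t + 1)]


def nn_transpose(n):
    """Return N N'."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in n] for ri in n]


def is_eigenvector(m, v):
    """True if v is non-zero and m v is a scalar multiple of v."""
    mv = [sum(a * x for a, x in zip(row, v)) for row in m]
    size = range(len(v))
    return any(v) and all(mv[i] * v[j] == mv[j] * v[i] for i in size for j in size)


# Balanced incomplete block design: every pair of treatments shares one block.
bib = nn_transpose(incidence([{1, 2}, {1, 3}, {2, 3}], 3))

# Cyclic design: treatments 1 and 3 (and 2 and 4) never share a block.
cyclic = nn_transpose(incidence([{1, 2}, {2, 3}, {3, 4}, {4, 1}], 4))

bib_contrast = is_eigenvector(bib, [1, -1, 0])
cyclic_pair = is_eigenvector(cyclic, [1, -1, 0, 0])
cyclic_alternating = is_eigenvector(cyclic, [1, -1, 1, -1])
```

In the balanced design every treatment contrast passes the check, in line with Corollary 7.2. In the cyclic design the simple difference between treatments 1 and 2 fails, while the alternating contrast passes.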
The theorem to follow and the necessity conditions of Theorem 7 and
Corollary 7.2 were established by Sprott (1956) using manipulations of
solutions to the normal equations under the restrictive assumption of
estimability in both sources.

Theorem 8: In an incomplete block design $(t, r, b, k, s_{ij})$ a
necessary and sufficient condition that there exist a subset of treatments
$T_1, \ldots, T_a$ such that $\tau_i - \tau_j$ is b.c.s.w. for all possible pairs in the
subset, is that all pairs $T_i$ and $T_j$, $i \ne j$ and $i, j \le a$, occur together
in a block a constant number of times and that any other treatment $T_u$,
$u > a$, occur in a block a constant number of times $s_u$ with $T_1, \ldots, T_a$.

In factorial designs in incomplete blocks, resolvable into uncorrelated
replications, each replicate consists of uncorrelated interblock and
intrablock sources of information on the treatment parameter vector $\tau$.
Thus there are 2r uncorrelated sources of information on $\tau$. In terms of
the interblock information matrix of each single replicate,
the following theorem was easily established.

Theorem  9:  In  a  symmetric  factorial  design,  with  complete 
confounding  of  full  sets  of  effect  or  interaction  degrees  of  freedom  within 
replicates, any effect or interaction degree of freedom contrast $\lambda'\tau$ is
such that $\lambda$ is an eigenvector of the interblock information matrix of the
$i$-th replicate, $i = 1, \ldots, r$.

Corollary  9.  1:  In  a  symmetric  factorial  design,  with  complete 
confounding  of  full  sets  of  effect  or  interaction  degrees  of  freedom  within 
replicates,  any  effect  or  interaction  degree  of  freedom  contrast  is  b.  c.  s.w. 
for  the  whole  set  of  2r  interblock  and  intrablock  sources  of  information. 
III. SIZE AND POWER OF CERTAIN TESTS UNDER
EXPERIMENT RANDOMIZATION

We  have  conducted  an  investigation  of  the  size  and  power  of  the  F 
test  and  three  non-parametric  tests  in  an  attempt  to  understand  more 
thoroughly  the  consequences  of  experiment  randomization.  In  particular 
we  have  studied  the  behavior  of  tests  applicable  to  a  paired  design  and 
have  further  restricted  the  investigation  to  include  small  samples  only. 
The  test  procedures  examined  in  detail  were  the  Fisher  randomization 
test,  the  Sign  test,  the  Wilcoxon  paired  test  and  the  normal  theory  F 
test.  A  specification  of  these  tests  is  as  follows. 

(a) The Fisher Randomization Test:

The observed total difference is $\sum x_i$. Let $C_{obs}$ equal the
absolute value of this. Consider the absolute values of the
possible quantities $\sum_i (\pm) x_i$, where each of the $2^n$
different patterns of + or - are enumerated. Let the
absolute values be $C_1, C_2, \ldots, C_M$, where $M$ equals $2^n$.
The significance level is the proportion of the $C_i$ which equal
or exceed $C_{obs}$. Actually one need enumerate only $2^{n-1}$
different patterns, because the criterion is the absolute total
difference.

(b) The Sign Test:

Let the maximum of the number of positive $x_i$'s and the
number of negative $x_i$'s be $S_{obs}$. Follow the same procedure
with this criterion. Actually we do not need to perform the
details, because the possible values of the criterion are
$n, n-1, \ldots, n_0$, where $n_0$ is $n/2$ if $n$ is even and
$(n+1)/2$ if $n$ is odd, and their frequencies are given by
combining the tails of the binomial distribution for $n$ trials
with probability of success equal to $\tfrac{1}{2}$.

(c) The Wilcoxon Paired Test:

The $x_i$ are ranked from smallest to largest disregarding
signs. Let the maximum of the sum of the ranks of the
negative observations, and the sum of the ranks of the
positive observations be $W_{obs}$. Follow the same procedure
with this criterion. The critical values for small values of
$n$ and the possible significance levels are given in tables,
for example by Hodges and Lehmann (1963).

(d) The F Test:

In the case of the paired design, the F test is very simple:
calculate the criterion [ treatment mean square / error
mean square ] and compare this value with the chosen
percentage point of the F distribution with 1 and (n-1)
degrees of freedom, where n is the number of pairs.
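The four criteria specified in (a) to (d) can be sketched for small $n$ by direct enumeration. The code below is a minimal illustration, not the computing procedure used in the study: the function names are hypothetical, the sign and Wilcoxon versions assume no zero differences and no tied absolute values, and the F criterion is returned without looking up a percentage point.

```python
from fractions import Fraction
from itertools import product
from math import comb


def fisher_level(diffs):
    """Criterion (a): proportion of sign patterns with |total| >= observed."""
    c_obs = abs(sum(diffs))
    totals = [abs(sum(s * x for s, x in zip(signs, diffs)))
              for signs in product((1, -1), repeat=len(diffs))]
    return Fraction(sum(c >= c_obs for c in totals), len(totals))


def sign_level(diffs):
    """Criterion (b): combined binomial tails; assumes no zero differences."""
    n = len(diffs)
    s_obs = max(sum(x > 0 for x in diffs), sum(x < 0 for x in diffs))
    count = sum(comb(n, j) for j in range(n + 1) if max(j, n - j) >= s_obs)
    return Fraction(count, 2 ** n)


def wilcoxon_level(diffs):
    """Criterion (c): assumes no zero differences and no tied absolute values."""
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0] * len(diffs)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    total = sum(ranks)
    positive = sum(r for r, x in zip(ranks, diffs) if x > 0)
    w_obs = max(positive, total - positive)
    hits = 0
    for signs in product((True, False), repeat=len(ranks)):
        p = sum(r for r, keep in zip(ranks, signs) if keep)
        hits += max(p, total - p) >= w_obs
    return Fraction(hits, 2 ** len(ranks))


def f_criterion(diffs):
    """Criterion (d): treatment mean square / error mean square, 1 and n-1 d.f."""
    n = len(diffs)
    mean = Fraction(sum(diffs), n)
    error_ms = sum((x - mean) ** 2 for x in diffs) / (n - 1)
    return n * mean ** 2 / error_ms


differences = [3, -1, 2, 4, 5]  # hypothetical paired differences
levels = (fisher_level(differences), sign_level(differences),
          wilcoxon_level(differences))
f_value = f_criterion(differences)
```

The full enumeration over $2^n$ patterns is kept for clarity even though, as noted in (a), half of the patterns would suffice.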

The objective of the study was the determination of the relative and
absolute performance of these test procedures with regard to the population
of repetitions induced by physical randomization. If we view the
significance level as a summary statistic, a complete characterization
of the situation is given by the distribution of the significance level under
the null hypothesis, and the distribution of the significance level under
the alternative. Since this is an overwhelming task, a common procedure
is to examine the power of tests, which is essentially the tail areas of the
distribution of the significance level under the alternative hypothesis. Thus,
size and power served as reasonable criteria with which to measure test
performance.

With N pairs observed in the experiment, there are $2^N$ possible
ways of applying two treatments within each pair, one of which is
randomly chosen by the experimenter. The null hypothesis of no treatment
difference is then tested against various shift alternatives. In this
way it is possible to evaluate critically the influential characteristics
inherent in the problem of paired tests. By examining size and power,
we obtain the role of the test criterion, significance level, experiment
size, true treatment difference and the underlying distribution from
which the basal yields are generated.

Since the parametric F test and the Sign test have been dealt with
extensively in the literature, emphasis was concentrated on the
performance of the Fisher and Wilcoxon techniques as applied to paired
data. Completely general integration formulas were developed to enable
power computations to be performed for experiments involving three or
four pairs of differences. A perfect agreement of the three non-parametric
tests at the lowest achievable test size was exhibited, regardless of the
experiment size, with the correspondence extending to the three smallest
levels for the Fisher and Wilcoxon criteria.

To  extend  the  investigation  to  larger  experiments  it  was  necessary 
to perform an empirical study. With a set of differences randomly
generated from various representative distributions and an imposed
treatment effect $\Delta$, it was possible to generate the totality of
conceptual  experiments  that  might  have  arisen.  Each  test  criterion  was 
then  evaluated  for  every  possible  randomization,  and  the  appropriate 
significance  levels  recorded  in  each  case.  In  this  way  exact  power 
probabilities  were  computed  for  each  test  over  the  population  defined  by 
the  randomization  process.  By  performing  these  calculations  for  a 
representative  number  of  samples  of  observed  differences,  an  indication 
of the small-sample behavior of the four tests of interest was established.
Experiments  of  3,  4,  5,  6  and  8  pairs  were  examined  in  this  manner, 
and  where  theoretical  comparisons  exist  the  results  indicate  excellent 
agreement  with  true  power  values.  Since  the  power  under  experiment 
randomization  does  not  behave  with  a  noticeable  regularity  for  individual 
experiments,  comparisons  of  tests  were  based  on  average  power  values 
determined  from  several  random  samples  of  differences.  Because  of  the 
considerable  computing  time  involved,  various  sampling  techniques  were 
utilized  for  a  limited  investigation  of  the  Fisher  criterion  for  experiments 
involving  ten  differences. 
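The exact power calculation over the randomization population can be sketched as follows. The sketch assumes an additive model in which the observed difference for a pair is the basal difference, with a sign fixed by the randomization, plus the shift; this model, the function names, and the numerical values are illustrative assumptions rather than details taken from the study.

```python
from fractions import Fraction
from itertools import product


def fisher_level(diffs):
    """Significance level of the Fisher randomization test by full enumeration."""
    c_obs = abs(sum(diffs))
    totals = [abs(sum(s * x for s, x in zip(signs, diffs)))
              for signs in product((1, -1), repeat=len(diffs))]
    return Fraction(sum(c >= c_obs for c in totals), len(totals))


def randomization_power(basal_diffs, shift, alpha, level_fn=fisher_level):
    """Proportion of the 2^N conceptual experiments significant at level alpha.

    Assumed model: observed difference = (+/-) basal difference + shift,
    the sign being determined by which unit of the pair receives which treatment.
    """
    designs = list(product((1, -1), repeat=len(basal_diffs)))
    rejections = 0
    for signs in designs:
        observed = [s * b + shift for s, b in zip(signs, basal_diffs)]
        rejections += level_fn(observed) <= alpha
    return Fraction(rejections, len(designs))


basal = [1, 2, 3, 4]  # hypothetical basal differences for four pairs
size = randomization_power(basal, 0, Fraction(1, 8))     # no treatment effect
power = randomization_power(basal, 10, Fraction(1, 8))   # large shift
```

With no shift the proportion of significant experiments equals the chosen level, which is the known null behavior of the randomization test; with a shift large enough to make every observed difference positive, every conceptual experiment is significant at the lowest achievable level.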

The  general  conclusion  is  that  with  small  samples  of  differences  from 
any  of  the  distributions  considered,  the  average  powers  of  the  Fisher 
randomization  test  and  the  Wilcoxon  paired  test  are  essentially  identical. 
The power curve of the Sign test is somewhat inferior to that of the other
tests at comparable sizes greater than $1/2^{N-1}$. It is also shown that

knowledge  of  the  power  of  the  Sign  test  at  the  lowest  achievable  test  size 
is  complete  in  the  sense  that  power  at  all  other  levels  is  uniquely 
related. The relative behavior of the F test and the non-parametric tests
is somewhat irregular, but in most cases the power values are quite close.

There  is  evidence  that  departures  from  normality  do  not  drastically  affect 
the  relative  performances  of  the  tests  examined,  but  for  extreme  non- 
normal  configurations  power  is  low  and  erratic  in  its  behavior.  The 
average  size  of  the  F  test  was  generally  quite  close  to  the  nominal  normal 
distribution  size  even  when  the  underlying  distribution  of  differences  was 
decidedly non-normal. The distribution of the size of the F test under
experiment randomization was examined in some detail, and it was found
that the probability of detecting significance at level $\alpha$ is distributed with
considerable spread about the true test size $\alpha$. The spread is greatly
dependent on the underlying distribution of differences.

It  appears  that  except  for  their  inability  to  achieve  any  prechosen  size, 
the non-parametric tests are to be preferred because their behavior under
the  null  hypothesis  is  known  a  priori  regardless  of  the  underlying  pattern 
of  basal  yields.  If  one  admits  Fisher's  concept  of  sensitivity  relative  to 
the  problem  of  evaluating  significance,  the  Fisher  randomization  test  is 
slightly  superior  to  the  Wilcoxon  test,  while  both  are  considerably  more 
sensitive  than  the  Sign  test.  In  this  framework  we  look  upon  the 
significance  level  as  a  summary  statistic  giving  the  weight  of  evidence 
against a null hypothesis with reference to a particular class of
alternatives.  For  the  paired  design  we  have  seen  that  the  Fisher 
criterion  includes  more  levels  for  the  declaration  of  significance  than 
the other non-parametric tests. From this point of view the Sign test
should  be  recommended  only  when  none  of  the  other  procedures  are 
applicable. 

It  is  evident  that  usage  of  the  Fisher  randomization  test  or  the 
Wilcoxon  paired  test  is  advantageous  to  the  experimenter  when  testing 
two treatments. We have seen that the test criteria can be quickly
enumerated  over  all  possible  randomizations  when  the  number  of 
observed  differences  is  small.  For  moderate  sample  sizes,  excellent 
approximations  were  observed  by  sampling  a  reasonable  proportion  of 
randomizations. 
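Approximating the Fisher significance level by sampling randomizations can be sketched as below. The sketch draws sign patterns independently and with replacement, which is one possible sampling scheme and an assumption here; the function name, the seed, and the data are hypothetical.

```python
import random


def sampled_fisher_level(diffs, n_samples, rng):
    """Estimate the Fisher randomization level from randomly drawn sign patterns."""
    c_obs = abs(sum(diffs))
    hits = 0
    for _ in range(n_samples):
        # Each difference keeps or flips its sign with probability 1/2.
        total = sum(x if rng.random() < 0.5 else -x for x in diffs)
        hits += abs(total) >= c_obs
    return hits / n_samples


rng = random.Random(1966)  # fixed seed so the sketch is repeatable
differences = [3, 5, 2, 7, 4, 6, 1, 8, 9, 10]  # ten hypothetical differences
estimate = sampled_fisher_level(differences, 2000, rng)
```

For these ten all-positive differences the exact level is 2/1024, so the estimate should be small; how many samples constitute a "reasonable proportion" depends on the level being estimated.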
IV. COMPUTATION OF ESTIMATES OF VARIANCES AND COVARIANCES
OF VARIANCE COMPONENT ESTIMATES FROM FINITE
BALANCED POPULATIONS

INTRODUCTION 

Dayhoff  (1964)  has  shown  that  the  variances  and  covariances  of 
variance  component  estimates  for  certain  simple  balanced  structures 
obtained  in  the  usual  way,  by  equating  the  expected  mean  squares  to  the 
observed  mean  squares  in  the  analysis  of  variance  and  solving  the 
resulting  linear  equations  for  the  variance  components,  can  be  formulated 
as  linear  functions  of  quantities  called  generalized  polykays.  The 
generalized  polykays  are  a  natural  extension  of  the  bipolykays  defined  by 
Hooke  (1956a),  from  which  he  was  able  to  calculate  variances  and 
covariances  of  estimated  variance  components  in  two  factor  crossed 
structures,  as  shown  in  the  papers  by  Hooke  (1954,  and  1956b).  The 
bipolykays  were  an  extension  of  the  polykays  introduced  by  Tukey  (1950, 
and  1956). 

The  generalized  polykays  are,  in  general,  not  directly  computable, 
but  can  be  obtained  as  linear  functions  of  generalized  symmetric  means, 
which  in  the  case  of  polykays  of  degree  four,  are  fourth  moments  of  the 
population  or  sample  quantities.  Because  polykays  and  symmetric  means 
have  the  property  of  inheritance  on  the  average,  it  is  possible  to  obtain 
unbiased  estimates  of  the  variances  and  covariances  of  estimated  variance 
components by taking appropriate linear combinations of generalized
sample  polykays.  The  work  of  Dayhoff  is  thus  complete  for  the  pure 
random  sampling  situation  in  that,  by  his  methods,  one  may  obtain 
formulas  for  unbiased  estimates  of  the  variances  and  covariances  of  the 
estimated components of variation.
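
The property of inheritance on the average can be checked numerically for the simplest case. The sketch below assumes the polykay of degree two for a single factor, taken as the mean of squares minus the mean of products of distinct pairs (the variance with divisor n - 1); the population values and the function name `k2` are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations


def k2(values):
    """Polykay of degree two: mean of squares minus mean of distinct products."""
    n = len(values)
    sum_sq = sum(v * v for v in values)
    total = sum(values)
    mean_sq = Fraction(sum_sq, n)
    mean_cross = Fraction(total * total - sum_sq, n * (n - 1))
    return mean_sq - mean_cross


population = [2, 3, 5, 7, 11]
samples = list(combinations(population, 3))
average_sample_k2 = sum(k2(s) for s in samples) / len(samples)
print(average_sample_k2, k2(population))
```

Averaging the sample value over every possible sample of size three reproduces the population value exactly, which is the sense in which the sample polykay is an unbiased estimate.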

The implementation of Dayhoff's methods to obtain numerical estimates
involves two fairly serious problems. First, the algebra required to
obtain the formulas is, while straightforward in principle, a very tedious
and error-prone process in execution. As an example, for a three-factor
crossed structure a single variance formula involves thirty-seven polykays
of degree four with coefficients which are various functions of the numbers
of  levels  of  the  factors  in  the  sample  and  in  the  population.  Each  of  these 
polykays  is,  in  turn,  a  linear  function  of  as  many  as  285  generalized 
symmetric  means  of  degree  four.  Because  of  the  heavy  burden  of  algebra 
required  it  seems  expedient  to  perform  this  task  on  high  speed  digital 
computers.  Accordingly,  algorithms  have  been  developed,  which,  when 
presented  with  an  arbitrary  balanced  complete  population  structure, 
obtain  the  necessary  formulas  for  the  variances  and  covariances  of 
variance  components. 

The  second  problem  arises  in  the  numerical  computation  of  the 
generalized symmetric means of degree four, which, in Dayhoff's method,
are the basic numerical quantities to be computed. It is a rather simple
matter to write a computer program to evaluate a single generalized
symmetric mean from its definition and not extremely difficult to write
a more general program to compute all the generalized symmetric means of
degree four in a given structure. This can be extended further, with some
difficulty, to compute the generalized symmetric means of degree four for
arbitrary balanced complete structures. However, even for relatively small
numbers of observations the number of multiplications becomes excessive,
and a better approach is necessary. A simple illustration of the approach
taken  here  is  given  by  the  familiar  identity  below 

$$\frac{2}{n(n-1)} \sum_{i<j} y_i y_j = \frac{1}{n(n-1)} \left[ \Big( \sum_i y_i \Big)^2 - \sum_i y_i^2 \right]$$

The left hand quantity is a simple symmetric mean of degree two and,
aside from the divisor, requires n(n-1)/2 multiplications and additions,
while  the  expression  on  the  right  requires  n  +  1  multiplications  and 
2n  +  1  additions.  Similar  identities  may  be  obtained  for  generalized 
symmetric  means  of  degree  four,  and  these  result  in  important  savings 
in  the  amount  of  computations  required.  These  identities  do,  of  course, 
increase  the  amount  of  algebra  required,  and  care  must  be  taken  that  one 
does  not  exchange  the  problem  of  performing  an  impossibly  large  number 
of  multiplications  for  the  problem  of  collecting  an  impossibly  large  number 
of  coefficients. 
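
The saving can be seen in a small sketch. It assumes the degree-two case for a single factor, with the pairwise mean computed once from its definition and once from the sum and the sum of squares of the observations; the function names are hypothetical and the data are made up.

```python
def pair_mean_direct(y):
    """Symmetric mean of degree two from its definition: average of y[i]*y[j], i < j."""
    n = len(y)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += y[i] * y[j]
    return total / (n * (n - 1) / 2)


def pair_mean_from_sums(y):
    """Same quantity from the sum and the sum of squares of the observations."""
    n = len(y)
    s1 = sum(y)
    s2 = sum(v * v for v in y)
    return (s1 * s1 - s2) / (n * (n - 1))


y = [1, 2, 3, 4]
print(pair_mean_direct(y), pair_mean_from_sums(y))
```

The first function performs one multiplication per pair of observations, while the second needs only one pass over the data, so the gap widens quickly as n grows.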

A general method of obtaining all the needed identities in a straightforward
way has been developed and implemented in a computer program,
so  that  the  generalized  symmetric  means  are  formulated  in  terms  of 
quantities  which  are  computable  in  a  minimum  number  of  operations. 

These  quantities  are  called  D's  or  "derived  terms."  The  same 
quantities,  for  the  particular  case  of  two  factor  crossed  structures,  are 
used  by  Hooke  (1954).  Further  programs  have  been  developed  which 
interpret  the  D's  and  compute  their  numerical  values. 

DAYHOFF'S PROCEDURE

The theoretical basis for the computations, as developed by Dayhoff
(1964), is as follows.

Variance component estimates may be considered as linear combinations
of sample cap sigmas. The cap sigmas are in fact the same
quantities  as  generalized  polykays  of  degree  two,  so  that  variances  and 
covariances  of  variance  component  estimates  are  linear  functions  of  the 
variances  and  covariances  of  sample  generalized  polykays  of  degree  two. 
Variances  and  covariances  of  sample  generalized  polykays  of  degree  two 
are  linear  combinations  of  population  generalized  polykays  of  degree  four, 
and  unbiased  estimates  of  these  are  given  by  the  corresponding  sample 
polykays  of  degree  four.  The  generalized  polykays  of  degree  four  are 
linear  functions  of  the  generalized  symmetric  means  of  degree  four,  which 
can  be  computed. 

POLYKAYS  AS  FUNCTIONS  OF  SYMMETRIC  MEANS 

The  generalized  polykays  for  a  crossed  structure  are  defined  as 
functions  of  simple  polykays  by  means  of  symbolic  multiplication.  Thus 
let $P = (\alpha/\beta)$ denote a generalized polykay of degree four for two factors.
The $\alpha$ and $\beta$ symbols may be considered as indicating a partition of the
subscripts into classes which are equal for each element of the fourth
degree product of the "leading" symmetric mean in the definition of $(\alpha)$.
We make use of the notation, introduced by Dayhoff, of giving a symbol
for each element of the product with the equality of these symbols
indicating equality of the subscripts. Thus to denote a product $y_i y_i y_i y_{i'}$,
write 0001, while $y_i y_{i'} y_{i''} y_{i'''}$ is denoted by 0123. If one uses primes
to indicate restrictions on the subscripts, then the symbols $\alpha$, $\beta$, etc.
can be considered simply as a list of the number of primes on the
successive y's for the first, second, etc. factors.

These  lists  can  be  considered  as  partitions  of  the  integer  four,  with 
a  further  order  restriction;  we  will  call  them  "ordered  partitions." 

The  ordering  consideration  is  not  necessary  when  simple  polykays  are 
considered,  but  when  more  than  one  factor  is  considered  the  ordering 
becomes  necessary  so  that  the  relationships  of  the  restrictions  for  the 
various  factors  will  be  preserved. 
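
The symbol lists can be generated mechanically. The sketch below rests on an assumption that the report does not state explicitly: that each list is written in a canonical form in which the first symbol is 0 and every new symbol is one more than the largest symbol already used. Under that assumption the lists of length four are the fifteen strings produced here; the function name is hypothetical.

```python
def ordered_partitions(degree=4):
    """List symbol strings such as '0012', where each new symbol is one more
    than the largest symbol already used (assumed canonical form)."""
    results = []

    def extend(prefix, largest):
        if len(prefix) == degree:
            results.append("".join(str(s) for s in prefix))
            return
        # Reuse any symbol seen so far, or introduce the next new one.
        for symbol in range(largest + 2):
            extend(prefix + [symbol], max(largest, symbol))

    extend([0], 0)
    return results


symbols = ordered_partitions()
print(len(symbols), symbols)
```

Both examples from the text, 0001 and 0123, appear in the generated list.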

The  simple  polykays  may  be  expressed  as  linear  combinations  of 
simple  symmetric  means,  so  that 

$$(\alpha) = \sum_i a_i \langle \alpha_i \rangle$$

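The generalized polykays for a completely crossed structure are defined by
a symbolic multiplication

$$
\begin{aligned}
P = (\alpha/\beta) &= (\alpha) \otimes (\beta) \\
&= \Big( \sum_i a_i \langle \alpha_i \rangle \Big) \otimes \Big( \sum_j b_j \langle \beta_j \rangle \Big) \\
&= \sum_i \sum_j a_i b_j \langle \alpha_i \rangle \otimes \langle \beta_j \rangle \\
&= \sum_i \sum_j a_i b_j \langle \alpha_i/\beta_j \rangle
\end{aligned}
$$

or, say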

Unfortunately the notation described above is redundant in that we
may have $(\alpha/\beta) = (\alpha'/\beta')$ with $\alpha \ne \alpha'$, or $\beta \ne \beta'$, or both. No
simple notation has been discovered for removing this redundancy,
although an algorithm has been developed to give a many-to-one mapping
of all the possible symbols for a given set of generalized polykays into a
set of distinct ones. Carrying out the symbolic multiplication, combining
like terms and collecting coefficients is all that is necessary in obtaining
the polykays and their formulas in terms of generalized symmetric means
for crossed structures. The handling of arbitrary structures requires a
few additional operations.
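
The term-by-term expansion and the collection of coefficients can be sketched as below. This is an illustration of the bilinear bookkeeping only, not of the report's routines: the dictionary representation and the function name are assumptions, the input is the degree-two polykay (mean of squares minus mean of distinct products) rather than a degree-four one, and the reduction of redundant symbols to a unique representation is not attempted.

```python
from collections import defaultdict


def symbolic_multiply(first, second):
    """Multiply two linear combinations of symmetric-mean symbols term by term.

    Each argument maps a symbol (e.g. '01') to its coefficient; the result maps
    'alpha/beta' symbols to the product of the coefficients, with like terms
    collected.
    """
    product = defaultdict(int)
    for alpha, a in first.items():
        for beta, b in second.items():
            product[f"{alpha}/{beta}"] += a * b
    return {symbol: c for symbol, c in product.items() if c != 0}


# Degree-two polykay as mean of squares minus mean of distinct products.
k2 = {"00": 1, "01": -1}
print(symbolic_multiply(k2, k2))
```

Each key of the result names a generalized symmetric mean for two crossed factors, and its value is the coefficient with which that mean enters the generalized polykay.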

SYMMETRIC MEANS AS FUNCTIONS OF D's

The symbolic multiplication is also applicable in obtaining the
expansions  of  the  generalized  symmetric  means  in  terms  of  the  D's. 

Let $N_{\alpha\beta} = n_\alpha n_\beta$ denote the divisor of the generalized symmetric mean
$\langle \alpha/\beta \rangle$. (If $A$ is the number of levels of the first factor, then
$n_\alpha = A(A-1) \cdots (A-r+1)$, where $r$ is the number of different symbols
in the list $\alpha$.) Let $\langle \alpha_i \rangle$ denote a simple symmetric mean. Then there
exists a formula

$$\langle \alpha_i \rangle = \frac{1}{n_{\alpha_i}} \sum_k d_{ik} [\alpha_k]$$

where the $[\alpha_k]$ denote the "D" quantities for a single subscript. For
example, consider

$$\langle \alpha_i \rangle = \langle 0012 \rangle = \frac{1}{N} \, (\cdots)$$
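
As a sketch of how such an expansion works for a single factor, assume that the derived terms are built from the power sums of the observations (an assumption made for illustration; the report's D's may be defined differently). The symmetric mean written 0012 is then the average of $y_i^2 y_{i'} y_{i''}$ over distinct subscripts, and it can be obtained either from that definition or from four power sums. The function names are hypothetical.

```python
from itertools import permutations


def mean_0012_direct(y):
    """<0012> from its definition: average of y_i^2 * y_j * y_k over distinct i, j, k."""
    n = len(y)
    total = sum(y[i] ** 2 * y[j] * y[k] for i, j, k in permutations(range(n), 3))
    return total / (n * (n - 1) * (n - 2))


def mean_0012_from_power_sums(y):
    """Same quantity from the power sums p_r = sum of y_i ** r, r = 1..4."""
    n = len(y)
    p1, p2, p3, p4 = (sum(v ** r for v in y) for r in (1, 2, 3, 4))
    # Remove the terms in which two or more of the subscripts coincide.
    total = p2 * p1 ** 2 - 2 * p3 * p1 - p2 ** 2 + 2 * p4
    return total / (n * (n - 1) * (n - 2))


y = [1, 2, 3, 5]
print(mean_0012_direct(y), mean_0012_from_power_sums(y))
```

The divisor n(n-1)(n-2) is the product described in the text for a list with three different symbols. The direct form grows with the cube of n, whereas the power sums need a single pass over the data.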
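
It can be shown that if $\langle \alpha_i/\beta_j \rangle$ is a generalized symmetric mean for
a crossed structure, and

$$\langle \alpha_i \rangle = \frac{1}{n_{\alpha_i}} \sum_k d_{ik} [\alpha_k], \qquad \langle \beta_j \rangle = \frac{1}{n_{\beta_j}} \sum_l d_{jl} [\beta_l],$$

then

$$\langle \alpha_i/\beta_j \rangle = \frac{1}{n_{\alpha_i} n_{\beta_j}} \sum_k \sum_l d_{ik} d_{jl} [\alpha_k/\beta_l].$$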
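
Combining the two symbolic multiplications gives

$$
\begin{aligned}
P = (\alpha/\beta) &= (\alpha) \otimes (\beta) \\
&= \sum_i \sum_j a_i b_j \langle \alpha_i/\beta_j \rangle = \sum_i \sum_j a_i b_j \langle \alpha_i \rangle \otimes \langle \beta_j \rangle \\
&= \sum_i \sum_j \frac{a_i b_j}{n_{\alpha_i} n_{\beta_j}} \sum_k \sum_l d_{ik} d_{jl} [\alpha_k] \otimes [\beta_l] \\
&= \sum_i \sum_j \sum_k \sum_l \frac{a_i b_j d_{ik} d_{jl}}{n_{\alpha_i} n_{\beta_j}} [\alpha_k/\beta_l],
\end{aligned}
$$

or $P = \sum_s w_s D_s$, say.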


Thus  the  crossed  structure  polykays  can  be  evaluated  as  linear  functions 
of the D's, and the proper linear functions are determined by successive
symbolic  multiplications.  When  dealing  with  a  structure  containing  some 
factors  nested  in  others  the  above  operations  are  modified  in  the  following 
ways:  (1)  Some  of  the  polykays  of  the  crossed  structure  do  not  exist  in 
the  nested  structure.  These  are  eliminated.  (2)  In  performing  the  first 
symbolic  multiplication  those  polykays  of  the  crossed  structures  which 
do  not  exist  in  the  nested  structure  are  mapped  into  other  polykays  of  the 
nested  structure  and  the  terms  collected.  This  procedure  gives  the 
proper  formulas  for  the  polykays  of  the  nested  structure  in  terms  of  the 
generalized symmetric means for the nested structure. (3) Before
performing  the  second  symbolic  multiplication,  some  of  the  terms  in  the 
expansion  for  nested  polykays  are  eliminated  in  a  systematic  way  depending 
upon  the  terms  in  the  expansion  for  nesting  factors.  This  procedure  gives 
the  correct  formulas  for  nested  structures. 

COMPUTER PROGRAMS


The  programming  system  for  obtaining  estimated  variances  and 
covariances of variance component estimates consists of the following.

(1)  PFORM  -  A  program  to  generate  the  polykays  for  a  given  structure 
and  obtain  the  formulas  for  these  polykays  as  linear  functions  of  the 
generalized  symmetric  means.  This  portion  of  the  system  is  run 
separately since the formulas depend only upon the structure and not
the particular data being analyzed. The major subroutines of this program
are:

a.  SYMPY  -  A  routine  for  symbolic  multiplication 

b.  UNREP  -  A  routine  which  maps  the  various  representations 
of  a  given  polykay  into  a  unique  representation. 

(2) DCOMP - A routine which computes numerically all the D's for a
given structure and sample. This program consists of two major portions:
one  to  interpret  the  symbolic  representation  of  a  D  and  generate  certain 
tables  which  determine  the  base  addresses,  powers,  operations,  and 
sequence  of  operations  required  to  compute  the  particular  value  symbolized, 
and  a  second  program  to  follow  this  sequence  of  operations  and  obtain  the 
desired  numerical  quantity. 

(3) DVCMP - A routine to compute the divisors for the generalized
symmetric  means. 

(4)  GCOMP  -  A  program  which  performs  the  symbolic  multiplication  to 
obtain the expansions for generalized symmetric means in terms of the D
quantities and uses the D's from DCOMP, and the divisors from
DVCMP to evaluate the generalized symmetric means.


(5)  PCOMP  -  Reads  the  formulas  for  generalized  polykays  in  terms  of 
generalized  symmetric  means  (i.  e. ,  the  output  of  PFORM)  and  evaluates 
these formulas using the values of the generalized symmetric means
computed by GCOMP.

(6) VCVC - This program performs a variety of tasks, largely algebraic
in nature, in obtaining and evaluating the formulas for the variances and
covariances  of  variance  components  for  the  particular  structure  in  terms 
of  the  polykays  of  degree  four  which  have  been  previously  computed  by 
PCOMP.  Included  are  the  following  operations: 

a.  The  complete  model  for  the  present  structure  and  the 
corresponding  completely  crossed  model  are  generated.  This 
provides  a  symbolic  list  of  all  the  variance  components  and 
cap  sigmas  needed. 

b.  The  formulas  for  the  variances  and  covariances  of  the  crossed 
polykays  of  degree  two  in  terms  of  crossed  polykays  of  degree 
four  are  generated  by  symbolic  multiplication  of  the  formulas 
for  multiplication  of  simple  polykays  of  degree  two.  These 
formulas  are  then  evaluated  for  the  polykays  of  the  present 
structure  to  give  the  numerical  values  of  the  estimated  variances 
and  covariances  of  crossed  cap  sigmas  for  the  present  structure. 

c.  The  model  terms  for  the  present  structure  are  expanded  in  the 
terms  of  a  completely  crossed  structure  to  give  the  formulas  for 
the  cap  sigmas  of  the  present  structure  as  sums  of  crossed  cap 
sigmas. This transformation is then applied to the variance-covariance
matrix of crossed cap sigmas to give the variance-covariance
matrix of the cap sigmas of the current structure.

d. The transformation for cap sigmas in terms of variance
components is generated, inverted, and applied to the variance-covariance
matrix of the cap sigmas to give the estimated
variance-covariance matrix for the components of variance.
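
Steps c and d both apply a linear transformation to a variance-covariance matrix. A minimal sketch of step d follows, assuming the standard rule that if x has variance-covariance matrix V then T x has variance-covariance matrix T V T'. The 2 by 2 matrix `m`, the numerical values, and all function names are made up for illustration and are not taken from VCVC.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse_2x2(m):
    """Exact inverse of a 2 by 2 matrix."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def transform_covariance(t, v):
    """Covariance matrix of T x when x has covariance matrix V, i.e. T V T'."""
    return matmul(matmul(t, v), transpose(t))


# Hypothetical relation: cap sigmas = M * (variance components).
m = [[1, 1], [0, 1]]
v_cap_sigmas = [[4, 1], [1, 2]]          # made-up covariance matrix of the cap sigmas
v_components = transform_covariance(inverse_2x2(m), v_cap_sigmas)
print(v_components)
```

Inverting the transformation first and then applying it on both sides is what carries the matrix for the cap sigmas over to the matrix for the components of variance.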

USE  OF  THE  SYSTEM 

Thus  far  the  computations  with  the  system  have  been  made  with 
rather  special  data  for  the  purpose  of  checking  the  computer  programs. 

It  is  planned  to  use  the  system  to  investigate  the  variances  and 
covariances  of  realistic  populations  and  samples  and  to  compare  the 
results  with  those  obtained  under  infinite  model  assumptions. 

The  algorithms  are  designed  to  operate  for  any  number  of  factors. 
However,  the  present  program  is  limited  to  3  factors  so  that  various 
arrays  need  not  exceed  the  storage  capacity  available  with  the  IBM  7074 
FORTRAN  Operating  System  which  allows  about  8000  ten  digit  words 
for  program  and  data.  It  would  not  be  very  difficult  to  expand  the  program 
to 4 or 5 factors with the present 20,000 word IBM 7074 equipment,
but  this  is  not  contemplated  at  the  present  time  because  this  equipment 
will  be  replaced  in  the  near  future. 

Thus  far  the  computations  have  not  proved  too  costly.  For  example, 
with  a  four  by  four  crossed  sample  the  complete  computation  required  in 
the neighborhood of 28 seconds. (This included some extra operations
required for generating the data.) With two or three factors good sized
samples (say 1000 observations) can probably be computed in a matter
of  a  few  minutes,  say  1  to  5  minutes,  depending  on  the  model.  By 
comparison,  a  program  for  computing  the  generalized  symmetric  means 
directly  would  require  in  the  neighborhood  of  1000  hours  with  the  same 
computing  equipment. 

One  familiar  with  large  scale  numerical  computations  will  recognize 
that  the  type  of  computations  described  above  may  lead  to  serious 
truncation  errors.  This  is  indeed  true,  but  can  be  countered  to  a  large 
extent by the use of double precision arithmetic at selected points in the
algorithm and by standardizing the observations. In summary it seems
fair to claim that the system described provides a practical method for
obtaining unbiased estimates of variances and covariances of variance
components for finite balanced complete structures when few factors are
involved, and, while such computations may be of little importance for any
particular  data  set,  they  are  of  some  importance  in  the  investigation  of 
the  properties  of  variance  component  estimates  in  general. 
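
The effect of standardizing the observations on truncation error can be illustrated with a small sketch. It uses a corrected sum of squares rather than a fourth-degree quantity, and ordinary Python floating point rather than the ten-digit words of the IBM 7074, so it shows the mechanism only; the function names and the data are made up.

```python
def sum_of_squares_shortcut(y):
    """Corrected sum of squares via sum(y^2) - (sum y)^2 / n."""
    n = len(y)
    total = sum(y)
    return sum(v * v for v in y) - total * total / n


def standardize(y):
    """Centre the observations at their mean and scale to unit range."""
    mean = sum(y) / len(y)
    spread = max(y) - min(y) or 1.0
    return [(v - mean) / spread for v in y], spread


y = [1e9 + i for i in range(4)]          # large offset, small spread
naive = sum_of_squares_shortcut(y)
z, spread = standardize(y)
stable = sum_of_squares_shortcut(z) * spread ** 2
print(naive, stable)                     # the exact value is 5
```

On the raw data the two large terms agree in nearly all their leading digits, so their difference carries little information; after standardizing, the same formula works with numbers of moderate size.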

V. OTHER TOPICS

A.  THE  DESIGN  OF  EXPERIMENTS 

A  review  of  developments  in  the  design  of  experiments  over  the  past 
ten  years  was  prepared  and  presented  at  the  Tenth  Conference  on  the 
Design  of  Experiments  in  Army  Research,  Development  and  Testing 
(Kempthorne, 1965a). The problem of inference from experiments is
touched on only briefly, and the main area reviewed is the design and
analysis  of  investigations  in  multifactorial  situations.  The  sequence  of 
developments  with  regard  to  qualitative  factors  is  outlined,  from  the 
testing of the full factorial set, to the Fisher plans for $2^n - 1$ factors
each at 2 levels in $2^n$ observations, the Plackett-Burman plans for
$4N - 1$ factors at 2 levels in $4N$ observations, and then the development
of  fractional  replication  by  several  workers.  In  the  case  of  continuous  or 
quantitative  factors,  the  developments  are  reviewed  with  regard  to 
optimum seeking. The work of Box and Wilson and the PARTAN method,
which are essentially strategies based on an assumption of ellipsoidality of
contours without sizeable error variation, are discussed, as is the work of
Kiefer  and  Wolfowitz  and  others  which  is  concerned  with  proving 
convergence  with  probability  one  whatever  the  amount  of  error  present. 
Work  on  the  general  problem  of  exploring  the  relationship  between  control 
variables,  such  as  temperature  and  pressure,  and  yield  is  discussed. 

The  plans  developed  by  Box  and  his  co-workers  are  discussed  particularly 
with  reference  to  the  problem  of  scaling  of  variables.  In  contrast  to  this 
line  of  work  is  that  of  Kiefer  and  Wolfowitz  who  make  a  direct  attack  on 
design  to  achieve  optimality  with  regard  to  a  completely  defined  aspect  of 
the investigation. It appears that this approach is informative, but not
decisive, because an experimental investigation rarely has a single
criterion of value and it is usually the case that a design which is near
optimal with respect to one reasonable criterion of value is quite
non-optimal with respect to other criteria of value which the experimenter
must consider. It would appear then that at best the problem of design
can be formulated in programming terms, that is, one would like
optimality with respect to one criterion with a reasonable degree of
sub-optimality with respect to other criteria. This type of approach to design
is  being  explored  currently. 

B.  MULTIVARIATE  RESPONSES  IN  EXPERIMENTS 

A  review  of  the  status  of  procedures  for  data  interpretation  and 
inference  for  the  case  of  multivariate  responses  in  comparative 
experiments  was  presented  to  the  International  Symposium  on  Multivariate 
Analysis  (Kempthorne,  1965b).  The  view  is  expressed  and  substantiated, 
partially  at  least,  that  the  theoretical  work  in  multivariate  analysis  has 
so  far  led  to  quite  meager  results  with  regard  to  the  drawing  of 
experimental  conclusions.  A  dichotomy  is  drawn  between  experiments 
the  purpose  of  which  is  to  make  terminal  decisions,  such  as  the  naming 
of  the  "best"  treatment,  and  experiments  performed  to  add  to  knowledge. 
The  obvious  names  for  these  are  "decision"  experiments  and  "information" 
experiments.  It  appears  that  the  great  bulk  of  theoretical  work  is  aimed 
at  "decision"  experiments,  and  that  the  improvement  of  data  procedures 
for  "information"  experiments  has  been  disappointingly  small.  Some 
discussion is given of the rival modern "religions" of statistics, which
are associated with the words "Bayesian", "decision" and "likelihood."

An  assessment  of  what  scientists  want  from  the  comparative 
experiment  with  multivariate  response  is  made,  and  related  to  the 
current  availability  of  techniques.  It  is  concluded  that  the  situation  is 
deplorable.  The  conclusions  of  the  review  are  as  follows. 

(1)  The  purpose  of  statistical  analysis  of  experimental  informational 
data  is  to  form  opinions  about  the  underlying  situation.  One  can  certainly 
form  opinions  on  the  basis  of  univariate  techniques,  which  are 
communicable  and  fairly  easily  understood.  The  question  of  what 
multivariate  analysis  can  provide  over  and  above  separate  univariate 
analyses  has  an  obvious  answer  at  an  elementary  level,  as  in  the  study 
of  the  error  matrix,  but  is  unanswered  beyond  this.  It  is  relevant,  for 
instance,  to  ask  why  one  would  get  significance  at  a  particular  level  by 
correlated  univariate  tests  and  not  by  the  corresponding  multivariate 
test. An observation that this happens is, in itself, informative of the
situation  under  analysis  and  requires  examination  of  the  data  to  see  "why" 
it  happened.  There  are,  however,  situations  in  which  the  multivariate 
analysis  tells  one  something  about  individual  components  of  the  observation 
vector.  Suppose  one  observed  the  following  in  a  completely  randomized 
design:

                Mean squares and products

                $x_1^2$    $x_1 x_2$    $x_2^2$
Treatments        500         250         190
Residual          100          75         200

The data indicate differences among treatments with regard to $x_1$, but
not with regard to $x_2$, if one looks at the univariate analyses. But the
product analysis indicates that there are differences between treatments
with regard to both $x_1$ and $x_2$. Exactly how one can quantify these
indications appears to be unknown, but the data illustrate how the
multivariate analysis "tells" one more about one component of the observation
than  a  single  univariate  analysis. 

(2)  The  state  of  theoretical  knowledge  about  multivariate  observations, 
in  spite  of  very  good  books  on  the  subject,  seems  still  very  primitive. 
Naturally  enough,  the  theory  is  dominated  by  the  multivariate  normal 
distribution,  but  one  wonders  how  robust  the  procedures  for  assessing 
differences  of  means  are.  This  will  probably  have  to  be  assessed  by 
Monte  Carlo  computations.  We  have  some  obviously  desirable  tests  of 
significance  for  global  questions,  but  have  very  few  informative 
multivariate  data  dissection  procedures. 

(3) The future of data analysis obviously lies in the easy use of high
speed  computers.  The  only  way  "to  look  at"  multivariate  data  is  by 
means  of  computers  and  plotters.  Even  in  the  present,  after  20  years 
of  modern  computation,  the  problems  of  communicating  with  a  computer 
are  excessive.  Hopefully  these  problems  will  be  solved  soon,  and  we  will 
have manuals of data analysis just like manuals of chemical analysis.

The  presently  low  amount  of  truly  multivariate  analysis  is  certainly 
partly  due  to  inadequacy  of  computing  processes. 

(4)  Many  of  our  univariate  procedures  arose  from  looking  at  real  data 
and  trying  to  make  sense  of  them.  The  same  will  hold  for  multivariate 
data.  The  job  of  thinking  of  ways  of  looking  at  data  is  different  from  the 
job  of  determining  the  probability  behavior  of  these  ways. 

(5)  Even  though  the  usual  multivariate  techniques  seem  from  some  points 
of  view  to  assess  the  totality  of  the  data,  they  do  so  only  with  regard  to 
linear  functions  of  the  observations.  Ratios  and  other  indices  constructed 
from  the  components  may  well  behave  in  a  simple  way. 

C.  EXPERIMENTAL  INFERENCE 

The  Fisher  Memorial  Lecture  sponsored  by  the  American  Statistical 
Association,  the  Institute  of  Mathematical  Statistics  and  the  Biometric 
Society  was  presented  on  the  topic  of  experimental  inference 
(Kempthorne,  1965c). 

The  dichotomy  presented  by  Fisher's  writings  into  experimental  and 
non-experimental  inference  is  discussed,  and  the  paper  first  discusses 
the  more  basic  of  Fisher's  ideas  on  non-experimental  inference.  These 
are  considered  to  be  (a)  tests  of  significance,  (b)  the  use  of  the 
likelihood  function,  and  (c)  fiducial  probability.  Fundamental  obscurities 
with  regard  to  tests  of  significance  are  discussed.  The  use  of  the 
likelihood  function  is  examined,  and  it  is  concluded  that  much  of  the 
theoretical  work  on  likelihood  is  at  best  misleading  and  at  worst  utterly 
erroneous. The basis for this view is that an intrinsic aspect of data
collection  is  the  fact  that  a  continuously  distributed  random  variable  is 
observable  only  with  a  definite  grouping  error,  specified  by  the  observer, 
and  that  observations  of  unlimited  accuracy  are  impossible.  This  fact  is 
surely incontrovertible, but much of the mathematical theory now available
and being presented to students assumes the contrary. Some
simple  consequences  of  this  fact  are  substantiated  in  the  paper: 

(1)  the  likelihood  is  not  the  product  of  probability  densities,  but 
is  always  a  multinomial  likelihood,  which  may  or  may  not  be 
approximated  reasonably  by  the  former. 

(2) the likelihood properly calculated does not "blow up", that is,
become  infinitely  large  as  certain  values  for  the  parameters 
are  approached,  a  "fact"  which  has  been  stated  by  many 
research  workers. 

(3)  the  numerous  examples  in  the  literature  on  estimates  with 
variances asymptotically of the form $K/n^2$ are erroneous.

Some  views  with  regard  to  fiducial  probability  are  given. 

The  bulk  of  the  paper  consists  of  a  discussion  of  the  concepts  "validity 
of error", "validity of test" in experiments, and it is concluded that
Fisher's  writings  are  essentially  consistent  with  regard  to  these,  in  that 
validity  has  to  be  judged  with  reference  to  the  population  of  repetitions 
induced  by  the  physical  randomization  employed. 

A  short  review  is  made  of  investigations  on  the  performance  in  the 
randomization  framework  of  some  tests  of  significance  for  the  paired 
design. 




REFERENCES 


Basson,  R.  P.  (1965).  On  unbiased  estimation  in  variance  component 
models.  Unpublished  Ph.  D.  Thesis.  Library,  Iowa  State 
University  of  Science  and  Technology,  Ames,  Iowa. 

Dayhoff,  E.  E.  (1964).  Generalized  polykays  and  application  to  obtaining 
variances  and  covariances  of  components  of  variation. 
Unpublished  Ph.  D.  Thesis.  Library,  Iowa  State  University  of 
Science  and  Technology,  Ames,  Iowa. 

Doerfler,  T.  E.  (1965).  Size  and  power  of  some  tests  under  experimental 
randomization.  Unpublished  Ph.  D.  Thesis.  Library,  Iowa 
State  University  of  Science  and  Technology,  Ames,  Iowa. 

Graybill,  F.A.  and  R.  A.  Hultquist.  (1961).  Theorems  concerning 
Eisenhart's  Model  II.  Ann.  Math.  Stat.  32:  261-269. 

Henderson,  C.R.  (1953).  Estimation  of  variance  and  covariance 
components.  Biometrics  9:  226-252. 

Hooke, R. W. (1954). The estimation of polykays in the analysis of
variance. Stat. Res. Group, Princeton University.
Memorandum Report 56.

Hooke,  R.  W.  (1956a).  Symmetric  functions  of  a  two-way  array.  Ann. 
Math.  Stat.  27:  55-79. 

Hooke, R. W. (1956b). Some applications of bipolykays to the estimation
of variance components and their moments. Ann. Math. Stat.
27: 80-98.

Kempthorne, O. (1965a). Development of the design of experiments over
the past ten years. Presented at the Tenth Conference on the
Design of Experiments in Army Research, Development and
Testing, Washington, D. C.

Kempthorne, O. (1965b). Multivariate responses in comparative
experiments. Presented at the International Symposium on
Multivariate Analysis, Dayton, Ohio.

Kempthorne, O. (1965c). Some aspects of experimental inference.
The Fisher Memorial Lecture presented at the joint meetings of
the American Statistical Association, the Institute of
Mathematical Statistics, and the Biometric Society, Philadelphia,
Pennsylvania.




Sprott, D. W. (1956). A note on combined interblock and intrablock
estimation in incomplete block designs. Ann. Math. Stat.
27: 633-641.

Tukey, John W. (1950). Some sampling simplified. J. Am. Stat.
Assoc. 45: 501-519.

Tukey, John W. (1956). Keeping moment-like computations simple.
Ann.  Math.  Stat.  27:  37-54. 

Zyskind, G., O. Kempthorne, R. F. White, E. E. Dayhoff, and T. E.
Doerfler. (1964). Research on analysis of variance and
related topics. ARL report 193, Office of Aerospace Research,
United States Air Force, Wright-Patterson Air Force Base,
Ohio.




LIST OF ACTIVITIES ASSOCIATED WITH CONTRACT AF 33(615)-1737
DURING  THE  PERIOD  JULY,  1964  THROUGH  DECEMBER,  1965 


Speeches 


O.  Kempthorne 


"The  Current  Status  of  the  Design  of  Experiments." 

(Annual  meetings  of  the  Institute  of  Mathematical  Statistics  - 
August,  1964  -  Amherst,  Massachusetts). 

"Development  of  the  Design  of  Experiments  over  the  Past  Ten 
Years."  (Tenth  Conference  on  the  Design  of  Experiments  in 
Army  Research,  Development  and  Testing  -  November,  1964  - 
Washington, D. C.).

"Multivariate Responses in Comparative Experiments."
(International Symposium on Multivariate Analysis - June,
1965 - Dayton, Ohio).

"Some Aspects of Experimental Inference."
(Joint  meetings  of  American  Statistical  Association,  Biometric 
Society  and  Institute  of  Mathematical  Statistics  -  September, 
1965  -  Philadelphia,  Pennsylvania). 

O.  Kempthorne  -  Discussion  leader  -  "Statistical  Methods  for  Data 
Analysis." (Joint meeting of American Statistical Association,
Biometric  Society,  and  Institute  of  Mathematical  Statistics  - 
September,  1965  -  Philadelphia,  Pennsylvania). 


George  Zyskind  and  Frank  Martin 


"On  Simple  Combination  of  Information  from  Uncorrelated  Linear 
Sources." (Institute of Mathematical Statistics - Purdue
University  -  March,  1966). 


George  Zyskind  -  Discussion  leader  -  "Problems  in  the  Analysis  of  the 
General Comparative Experiments." (Joint meeting of American
Statistical Association, Biometric Society, and Institute of
Mathematical Statistics - September, 1965 - Philadelphia,
Pennsylvania). 


E. J.  Carney 

"The  Lattice  of  Ordered  Partitions  and  its  Relation  to  Generalized 
Symmetric  Means  and  Generalized  Polykays."  (Joint  meeting  of 
Biometric  Society,  Institute  of  Mathematical  Statistics,  and 
American Statistical Association - Brookhaven National Laboratory,
Upton, L. I., New York - April, 1966).


Publications 


Basson,  Rodney  -  On  unbiased  estimation  in  variance  component  models. 
Aerospace  Research  Laboratories,  Wright  Patterson  Air 
Force  Base,  Ohio.  ARL  report  submitted. 

Carney,  E.  J.  -  June,  1966  -  The  lattice  of  ordered  partitions  and  its 
relation  to  generalized  symmetric  means  and  generalized 
polykays.  Ann.  Math.  Stat.  37:  761-762.  (Abstract). 

Dayhoff,  E.  -  December,  1964  -  On  the  equivalence  of  polykays  of  the 
second degree and Σ's. Ann. Math. Stat. 35: 1663-1672.

Dayhoff, E. - February, 1966 - Generalized polykays, an extension of
simple polykays and bipolykays. Ann. Math. Stat. 37: 226-241.

Kempthorne, O. - March, 1966 - Some aspects of experimental inference.
J. Am. Stat. Assoc. 61: 11-34.

Kempthorne,  O.  -  October,  1965  -  Development  of  the  design  of  experiments 
over  the  past  ten  years.  Proceedings  of  the  Tenth  Conference  on 
the Design of Experiments in Army Research, Development and
Testing  -  AROD  Report  65-3:  19-46. 

Kempthorne, O. - Multivariate responses in comparative experiments.
Proceedings of the Multivariate Symposium, Dayton, Ohio.
To be published.

Martin, F. and G. Zyskind - October, 1966 - On combinability of
information from uncorrelated linear models by simple weighting.
Ann. Math. Stat. 37: 1338-1347.

Zyskind, G., O. Kempthorne, R. F. White, E. E. Dayhoff, and T. E.
Doerfler. - November, 1964 - Research on analysis of variance
and related topics. Aerospace Research Laboratories, Wright
Patterson Air Force Base, Ohio. ARL 64-193.




Theses 

Basson,  Rodney  -  (Under  G.  Zyskind) 

"On  unbiased  estimation  in  variance  component  models."  Ph.  D. 

Dayhoff,  E.E.  -  (Under  O.  Kempthorne) 

"Generalized  polykays  and  application  to  obtaining  variances  and 
covariances  of  components  of  variation."  Ph.  D. 

Doerfler,  T.  E.  -  (Under  O.  Kempthorne) 

"Size  and  power  of  some  tests  under  experimental  randomization." 
Ph.  D. 

Martin,  Frank  B.  -  (Under  G.  Zyskind) 

"On  simple  linear  combinability  of  information  from  independent 
sources." M. S.

Seminars  -  Iowa  State  University 
O.  Kempthorne 

November,  1964  -  Some  developments  in  the  design  of  experiments 
over the past ten years.

September,  1965  -  Comparison  of  tests  for  the  paired  design. 

G.  Zyskind 

March,  1965  -  On  the  invariance  of  a  class  of  canonical  forms 
arising  in  the  analysis  of  experimental  structures. 

F.  Martin 

September,  1965  -  On  simple  linear  combinability  of  information  from 
independent  sources. 

E.  Carney 

March,  1966  -  The  lattice  of  ordered  partitions  and  computation  of 
generalized  polykays. 




Unclassified
Security Classification

DOCUMENT CONTROL DATA - R&D

(Security classification of title, body of abstract and indexing annotation must be entered when the overall report is classified)

1. ORIGINATING ACTIVITY (Corporate author)
Statistical Laboratory
Iowa State University of Science and Technology
Ames, Iowa

2a. REPORT SECURITY CLASSIFICATION
Unclassified

2b. GROUP

3. REPORT TITLE
Research on Analysis of Variance and Data Interpretation

4. DESCRIPTIVE NOTES (Type of report and inclusive dates)
Scientific. Interim. 1 July 1964 - 31 December 1965

5. AUTHOR(S) (Last name, first name, initial)
Kempthorne, O., Zyskind, G., Basson, R. P., Martin, F. B., Doerfler, T. E., Carney, E. J.

6. REPORT DATE
December 1966

8a. CONTRACT OR GRANT NO.
AF 33(615)-1737

b. PROJECT NO. 7071
c. 61445014
d. 681304

9a. ORIGINATOR'S REPORT NUMBER(S)

9b. OTHER REPORT NO(S) (Any other numbers that may be assigned this report)
ARL 66-0240

10. AVAILABILITY/LIMITATION NOTICES
1. Distribution of this document is unlimited.

11. SUPPLEMENTARY NOTES

12. SPONSORING MILITARY ACTIVITY
Aerospace Research Laboratories (ARM)
Office of Aerospace Research, USAF
Wright-Patterson AFB, Ohio

13. ABSTRACT

Research on Analysis of Variance and Data Interpretation is described.
Section I discusses estimation problems in variance component and mixed model
problems. Section II considers the combination of information on estimable
functions from distinct uncorrelated sources and justifies some of the common
applications in experimental design problems. Section III discusses size and
power under experiment randomization of several competitive tests for the
paired design and presents conclusions about the high relative merits of the
variance ratio randomization and the Wilcoxon tests. Section IV discusses the
development of high speed computational methods for the calculation of fourth
degree generalized polykays of variances and covariances of estimated variance
components for balanced samples from balanced populations. Section V summarizes
briefly papers on the design of experiments and multivariate responses in
experiments and the 1965 Fisher Memorial lecture on experimental inference.


